{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Sequence Types"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "-"
},
"tags": [
"remove-cell"
]
},
"source": [
"**CS1302 Introduction to Computer Programming**\n",
"___"
]
},
{
"cell_type": "code",
"execution_count": 65,
"metadata": {
"ExecuteTime": {
"end_time": "2020-11-27T11:20:04.656873Z",
"start_time": "2020-11-27T11:20:04.651575Z"
},
"slideshow": {
"slide_type": "fragment"
},
"tags": [
"remove-cell"
]
},
"outputs": [],
"source": [
"import random\n",
"\n",
"%reload_ext mytutor"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Motivation of composite data type"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"The following code calculates the average of five numbers:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"ExecuteTime": {
"end_time": "2021-03-20T14:52:00.626044Z",
"start_time": "2021-03-20T14:52:00.608190Z"
},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"data": {
"text/plain": [
"3.0"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"def average_five_numbers(n1, n2, n3, n4, n5):\n",
" return (n1 + n2 + n3 + n4 + n5) / 5\n",
"\n",
"\n",
"average_five_numbers(1, 2, 3, 4, 5)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"What about using the above function to compute the average household income in Hong Kong. \n",
"The labor size in Hong Kong is close to [4 million](https://www.gov.hk/en/about/abouthk/factsheets/docs/employment.pdf).\n",
"- Should we create a variable to store the income of each individual?\n",
"- Should we recursively apply the function to groups of five numbers?"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"What we need is\n",
"- a *composite data type* that can keep a variable number of items, so that \n",
"- we can then define a function that takes an object of the *composite data type*,\n",
"- and returns the average of all items in the object."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"**How to store a sequence of items in Python?**"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We learned a composite data type that stores a sequence of characters. What is it?"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"`tuple` and `list` are two other built-in sequence types for ordered collections of objects. Unlike string, they can store items of possibly different types."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"Indeed, we have already used tuples and lists before."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"ExecuteTime": {
"end_time": "2020-11-02T23:25:35.106582Z",
"start_time": "2020-11-02T23:25:35.101478Z"
},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"data": {
"text/html": [
"\n",
" \n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%%mytutor -h 300\n",
"a_list = \"1 2 3\".split()\n",
"a_tuple = (lambda *args: args)(1, 2, 3)\n",
"a_list[0] = 0\n",
"a_tuple[0] = 0"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"**What is the difference between tuple and list?**"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"- List is [*mutable*](https://docs.python.org/3/library/stdtypes.html#index-21) so programmers can change its items.\n",
"- Tuple is [*immutable*](https://docs.python.org/3/glossary.html#term-immutable) like `int`, `float`, and `str`, so\n",
" - programmers can be certain the content stay unchanged, and\n",
" - Python can preallocate a fixed amount of memory to store its content."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Constructing sequences"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"**How to create tuple/list?**"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"Mathematicians often represent a set of items in two different ways:\n",
"1. [Roster notation](https://en.wikipedia.org/wiki/Set_(mathematics)#Roster_notation), which enumerates the elements in the sequence, e.g.,\n",
"\n",
"$$ \\{0, 1, 4, 9, 16, 25, 36, 49, 64, 81\\} $$\n",
"\n",
"2. [Set-builder notation](https://en.wikipedia.org/wiki/Set-builder_notation), which describes the content using a rule for constructing the elements, e.g.,\n",
"\n",
"$$ \\{x^2| x\\in \\mathbb{N}, x< 10 \\}, $$\n",
"\n",
"namely the set of perfect squares less than 100."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"Python also provides two corresponding ways to create a tuple/list: \n",
"1. [Enclosure](https://docs.python.org/3/reference/expressions.html?highlight=literals#grammar-token-enclosure)\n",
"2. [Comprehension](https://docs.python.org/3/reference/expressions.html#index-12)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"**How to create a tuple/list by enumerating its items?**"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"To create a tuple, we enclose a comma separated sequence by parentheses:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"ExecuteTime": {
"end_time": "2020-11-02T23:27:26.558639Z",
"start_time": "2020-11-02T23:27:26.554769Z"
},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"data": {
"text/html": [
"\n",
" \n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%%mytutor -h 450\n",
"empty_tuple = ()\n",
"singleton_tuple = (0,) # why not (0)?\n",
"heterogeneous_tuple = (singleton_tuple, (1, 2.0), print)\n",
"enclosed_starred_tuple = (*range(2), *\"23\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"Note that:\n",
"- If the enclosed sequence has one term, there must be a comma after the term.\n",
"- The elements of a tuple can have different types.\n",
"- The unpacking operator `*` can unpack an iterable into a sequence in an enclosure."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"To create a list, we use square brackets to enclose a comma separated sequence of objects."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"ExecuteTime": {
"end_time": "2020-11-02T23:29:55.099284Z",
"start_time": "2020-11-02T23:29:55.092488Z"
},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"data": {
"text/html": [
"\n",
" \n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%%mytutor -h 450\n",
"empty_list = []\n",
"singleton_list = [0] # no need to write [0,]\n",
"heterogeneous_list = [singleton_list, (1, 2.0), print]\n",
"enclosed_starred_list = [*range(2), *\"23\"]"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"We can also create a tuple/list from other iterables using the constructors `tuple`/`list` as well as addition and multiplication similar to `str`."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"ExecuteTime": {
"end_time": "2020-11-02T23:31:26.431382Z",
"start_time": "2020-11-02T23:31:26.426487Z"
},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"data": {
"text/html": [
"\n",
" \n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%%mytutor -h 950\n",
"str2list = list(\"Hello\")\n",
"str2tuple = tuple(\"Hello\")\n",
"range2list = list(range(5))\n",
"range2tuple = tuple(range(5))\n",
"tuple2list = list((1, 2, 3))\n",
"list2tuple = tuple([1, 2, 3])\n",
"concatenated_tuple = (1,) + (2, 3)\n",
"concatenated_list = [1, 2] + [3]\n",
"duplicated_tuple = (1,) * 2\n",
"duplicated_list = 2 * [1]"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"**Exercise** Explain the difference between following two expressions. Why a singleton tuple must have a comma after the item."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"ExecuteTime": {
"end_time": "2020-11-02T23:31:48.052688Z",
"start_time": "2020-11-02T23:31:48.048349Z"
},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"6\n",
"(3, 3)\n"
]
}
],
"source": [
"print((1 + 2) * 2, (1 + 2,) * 2, sep=\"\\n\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"nbgrader": {
"grade": true,
"grade_id": "singleton-tuple",
"locked": false,
"points": 0,
"schema_version": 3,
"solution": true,
"task": false
},
"slideshow": {
"slide_type": "-"
}
},
"source": [
"`(1+2)*2` evaluates to `6` but `(1+2,)*2` evaluates to `(3,3)`. \n",
"- The parentheses in `(1+2)` indicate the addition needs to be performed first, but \n",
"- the parentheses in `(1+2,)` creates a tuple. \n",
"\n",
"Hence, singleton tuple must have a comma after the item to differentiate these two use cases."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"**How to use a rule to construct a tuple/list?**"
]
},
{
"cell_type": "markdown",
"metadata": {
"ExecuteTime": {
"end_time": "2020-10-29T00:11:10.722819Z",
"start_time": "2020-10-29T00:11:10.718451Z"
},
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"We can specify the rule using a [comprehension](https://docs.python.org/3/reference/expressions.html#index-12), \n",
"which we have used in a [generator expression](https://docs.python.org/3/reference/expressions.html#index-22). \n",
"E.g., the following is a python one-liner that returns a generator for prime numbers."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"ExecuteTime": {
"end_time": "2020-11-02T23:36:56.247594Z",
"start_time": "2020-11-02T23:36:56.233173Z"
},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97\n"
]
},
{
"data": {
"text/plain": [
"\u001b[0;31mSignature:\u001b[0m \u001b[0mall\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0miterable\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m/\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;31mDocstring:\u001b[0m\n",
"Return True if bool(x) is True for all values x in the iterable.\n",
"\n",
"If the iterable is empty, return True.\n",
"\u001b[0;31mType:\u001b[0m builtin_function_or_method\n"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"all?\n",
"prime_sequence = lambda stop: (\n",
" x for x in range(2, stop) if all(x % divisor for divisor in range(2, x))\n",
")\n",
"print(*prime_sequence(100))"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"There are two comprehensions used:\n",
"- In `all(x % divisor for divisor in range(2, x))`, the comprehension creates a generator of remainders to the function `all`, which returns `True` if all the remainders are non-zero else `False`.\n",
"- In the return value `(x for x in range(2, stop) if ...)` of the anonymous function, the comprehension creates a generator of numbers from 2 to `stop-1` that satisfy the condition of the `if` clause. "
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"**Exercise** Use comprehension to define a function `composite_sequence` that takes a non-negative integer `stop` and returns a generator of composite numbers strictly smaller than `stop`. Use `any` instead of `all` to check if a number is composite."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"ExecuteTime": {
"end_time": "2020-11-02T23:36:33.954168Z",
"start_time": "2020-11-02T23:36:33.932818Z"
},
"nbgrader": {
"grade": false,
"grade_id": "composite_sequence",
"locked": false,
"schema_version": 3,
"solution": true,
"task": false
},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"4 6 8 9 10 12 14 15 16 18 20 21 22 24 25 26 27 28 30 32 33 34 35 36 38 39 40 42 44 45 46 48 49 50 51 52 54 55 56 57 58 60 62 63 64 65 66 68 69 70 72 74 75 76 77 78 80 81 82 84 85 86 87 88 90 91 92 93 94 95 96 98 99\n"
]
}
],
"source": [
"any?\n",
"### BEGIN SOLUTION\n",
"composite_sequence = lambda stop: (\n",
" x for x in range(2, stop) if any(x % divisor == 0 for divisor in range(2, x))\n",
")\n",
"### END SOLUTION\n",
"\n",
"print(*composite_sequence(100))"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"We can construct a list instead of a generator using [list comprehension](https://docs.python.org/3/glossary.html#term-list-comprehension):"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"data": {
"text/plain": [
"[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]"
]
},
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"[x ** 2 for x in range(10)] # Enclose comprehension by brackets"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"**Is the list comprehension the same as applying `list` to a generator expression?**"
]
},
{
"cell_type": "code",
"execution_count": 49,
"metadata": {
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"data": {
"text/plain": [
"[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]"
]
},
"execution_count": 49,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"list(x ** 2 for x in range(10)) # Enclose comprehension by brackets"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"List comprehension is more efficient as it does not need to create generator first:"
]
},
{
"cell_type": "code",
"execution_count": 51,
"metadata": {
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1.99 µs ± 35.8 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)\n"
]
}
],
"source": [
"%%timeit\n",
"[x ** 2 for x in range(10)]"
]
},
{
"cell_type": "code",
"execution_count": 50,
"metadata": {
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"2.55 µs ± 317 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)\n"
]
}
],
"source": [
"%%timeit\n",
"list(x ** 2 for x in range(10))"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"**Exercise** The following are two different ways to use comprehension to construct a tuple. Which one is faster? Try predicting the results before running them."
]
},
{
"cell_type": "code",
"execution_count": 52,
"metadata": {
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"4.41 µs ± 772 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)\n"
]
}
],
"source": [
"%%timeit\n",
"tuple(x for x in range(100))"
]
},
{
"cell_type": "code",
"execution_count": 53,
"metadata": {
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"2.57 µs ± 156 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)\n"
]
}
],
"source": [
"%%timeit\n",
"tuple([x for x in range(100)])"
]
},
{
"cell_type": "markdown",
"metadata": {
"nbgrader": {
"grade": true,
"grade_id": "generator-vs-tuple",
"locked": false,
"points": 0,
"schema_version": 3,
"solution": true,
"task": false
},
"slideshow": {
"slide_type": "-"
}
},
"source": [
"The second method is often faster because the list of items can be created faster with list comprehension instead of generator expression. This benefits appear to out-weight the cost in converting a list to a tuple."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"With list comprehension, we can simulate a sequence of biased coin flips."
]
},
{
"cell_type": "code",
"execution_count": 138,
"metadata": {
"ExecuteTime": {
"end_time": "2020-11-02T23:42:30.880408Z",
"start_time": "2020-11-02T23:42:30.832881Z"
},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Chance of head: 0.2976931267462506\n",
"Coin flips: T T T H T T T T T H T T H H T H H T H H T T T T T T T T T T T T T H T H T T H T T T T T T T T T H T T T T T T H T T H T H H T H T T T T H H T T T T T T T T T T H T H T H T T T T T H H H T H T H T T T T T T T T T T T T T T T T T H T H H T T T H T T H T H T T T H T T T T T H T H T T T T T T T T T T T T H T H T T T T T T T H T T T T H H T T H H T T H H H T T T T H T H T T T T T T H T T T T T T T T T T T T T T H T H H H T T H T T H T T T T H H T T T T T H T T H H T T H T H T H H T H T T H T T H T T T T H H T T T H T T T H T T T T T T T T H H T T T H T T T T H T T H H T T T T T T T H T H H T T H H H T H T T T T T T H T T T T T T H T H T T H T H T T T H T T T T H T T H T H T T T H T T T H H T T H T H H T T T T T T H H T H T T T H T T H H T T H T H T H T T T H T T T H H H T T T T T T T H T T H T T T H T T T T H T T T H T T T H T T T H T H T H T T H H T T T T T T H H T H H T T T T T H T T H T H T T H T T H T T H H T H H T H H H T H T T T T T T T H T T T H T H T H H H T T T T H H T T H T H T T T H H T H T H T T T T T H T H H T T T T H H T H H T T H H T T T T T H T T H T H T T H H T H T T H T H T T T T T H T T T H T T T T T T T H T H T T H H T H H T T T T T H T T T H H T T H T T H T T T T T H T T H H T T T H T T H H T T T T H H T T T T T T T H T H T H H T H H T H T T T H T T H H H T T T T T T H T T H T T T T T H T H T T T H T T T T T T H T T H H T T T H T T H T T H T T T T T T T T T H H T T T H T T T T T H H T H T H T H T T H H T T T T T T T T T T T H T T T T T T T T T H T T T H H H T T T H H T T T T H T T T T H T T H T T T T T H T T T T T T H T T H T T T T T H T H T H T T H T T T T T H T T T T T H T T T H T H H T H T H T T T T T H H T T T T T H T T T H H T H T H T H T T H T T T T T H T T T T H T H H T T H T T T T H H T T T T H T H H H H T T H T H T T T T T H T T T T H T T T T T T T T H H T T T T H T H H H T H T T H H T T T H H H T T H H T T T T T T T T T H T T H H T H T T T T T T T T H T H T T H T T H T H T T T T T T T T T H T H T T T H T H T H T H T T T T T T 
H T H T T T T\n"
]
}
],
"source": [
"from random import random as rand\n",
"\n",
"p = rand() # unknown bias\n",
"coin_flips = [\"H\" if rand() <= p else \"T\" for i in range(1000)]\n",
"print(\"Chance of head:\", p)\n",
"print(\"Coin flips:\", *coin_flips)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"We can then estimate the bias by the fraction of heads coming up."
]
},
{
"cell_type": "code",
"execution_count": 139,
"metadata": {
"ExecuteTime": {
"end_time": "2020-11-02T23:43:05.198459Z",
"start_time": "2020-11-02T23:43:05.193224Z"
},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Fraction of heads: 0.304\n"
]
}
],
"source": [
"def average(seq):\n",
" return sum(seq) / len(seq)\n",
"\n",
"\n",
"head_indicators = [1 if outcome == \"H\" else 0 for outcome in coin_flips]\n",
"fraction_of_heads = average(head_indicators)\n",
"print(\"Fraction of heads:\", fraction_of_heads)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"Note that `sum` and `len` returns the sum and length of the sequence."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"**Exercise** Define a function `variance` that takes in a sequence `seq` and returns the [variance](https://en.wikipedia.org/wiki/Variance) of the sequence."
]
},
{
"cell_type": "code",
"execution_count": 140,
"metadata": {
"ExecuteTime": {
"end_time": "2020-11-02T23:43:51.901668Z",
"start_time": "2020-11-02T23:43:51.897232Z"
},
"nbgrader": {
"grade": false,
"grade_id": "variance",
"locked": false,
"schema_version": 3,
"solution": true,
"task": false
},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"95% confidence interval: [0.27,0.33]\n"
]
}
],
"source": [
"def variance(seq):\n",
" ### BEGIN SOLUTION\n",
" return sum(i ** 2 for i in seq) / len(seq) - average(seq) ** 2\n",
" ### END SOLUTION\n",
"\n",
"\n",
"delta = (variance(head_indicators) / len(head_indicators)) ** 0.5\n",
"print(\"95% confidence interval: [{:.2f},{:.2f}]\".format(p - 2 * delta, p + 2 * delta))"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Selecting items in a sequence"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"**How to traverse a tuple/list?**"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"Instead of calling the dunder method directly, we can use a for loop to iterate over all the items in order."
]
},
{
"cell_type": "code",
"execution_count": 54,
"metadata": {
"ExecuteTime": {
"end_time": "2020-11-02T23:45:55.687173Z",
"start_time": "2020-11-02T23:45:55.681215Z"
},
"scrolled": true,
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0 1 2 3 4 "
]
}
],
"source": [
"a = (*range(5),)\n",
"for item in a:\n",
" print(item, end=\" \")"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"To do it in reverse, we can use the `reversed` function."
]
},
{
"cell_type": "code",
"execution_count": 55,
"metadata": {
"ExecuteTime": {
"end_time": "2020-11-02T23:45:16.488066Z",
"start_time": "2020-11-02T23:45:16.477429Z"
},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"4 3 2 1 0 "
]
}
],
"source": [
"reversed?\n",
"a = [*range(5)]\n",
"for item in reversed(a):\n",
" print(item, end=\" \")"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"We can also traverse multiple tuples/lists simultaneously by `zip`ping them."
]
},
{
"cell_type": "code",
"execution_count": 56,
"metadata": {
"ExecuteTime": {
"end_time": "2020-11-02T23:46:12.766014Z",
"start_time": "2020-11-02T23:46:12.751946Z"
},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0 4\n",
"1 3\n",
"2 2\n",
"3 1\n",
"4 0\n"
]
}
],
"source": [
"zip?\n",
"a = (*range(5),)\n",
"b = reversed(a)\n",
"for item1, item2 in zip(a, b):\n",
" print(item1, item2)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"**How to select an item in a sequence?**"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"Sequence objects such as `str`/`tuple`/`list` implements the [*getter method* `__getitem__`](https://docs.python.org/3/reference/datamodel.html#object.__getitem__) to return their items."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"We can select an item of a sequence `a` by [subscription](https://docs.python.org/3/reference/expressions.html#subscriptions) \n",
"```Python\n",
"a[i]\n",
"``` \n",
"where `a` is a list and `i` is an integer index."
]
},
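{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"Since subscription is carried out by `__getitem__`, the two forms in the following minimal sketch should select the same item. The variable name `word` is an assumption made up for this illustration, and calling the dunder method explicitly is rarely needed in practice.\n",
"\n",
"```python\n",
"word = 'hello'  # hypothetical example sequence\n",
"\n",
"# Subscription a[i] is handled by the __getitem__ method of the sequence type\n",
"assert word[1] == word.__getitem__(1) == 'e'\n",
"assert (1, 2, 3)[0] == (1, 2, 3).__getitem__(0) == 1\n",
"assert [1, 2, 3][2] == [1, 2, 3].__getitem__(2) == 3\n",
"```"
]
},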
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"A non-negative index indicates the distance from the beginning."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "-"
}
},
"source": [
"$$\\boldsymbol{a} = (a_0, ... , a_{n-1})$$"
]
},
{
"cell_type": "code",
"execution_count": 72,
"metadata": {
"ExecuteTime": {
"end_time": "2020-11-02T23:47:38.089722Z",
"start_time": "2020-11-02T23:47:38.080226Z"
},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)\n",
"Length: 10\n",
"First element: 0\n",
"Second element: 1\n",
"Last element: 9\n"
]
},
{
"ename": "IndexError",
"evalue": "tuple index out of range",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mIndexError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m/tmp/ipykernel_411/3903788463.py\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[1;32m 5\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'Second element:'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0ma\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 6\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'Last element:'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0ma\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mlen\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0ma\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m-\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 7\u001b[0;31m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0ma\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mlen\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0ma\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;31m# IndexError\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[0;31mIndexError\u001b[0m: tuple index out of range"
]
}
],
"source": [
"a = (*range(10),)\n",
"print(a)\n",
"print(\"Length:\", len(a))\n",
"print(\"First element:\", a[0])\n",
"print(\"Second element:\", a[1])\n",
"print(\"Last element:\", a[len(a) - 1])\n",
"print(a[len(a)]) # IndexError"
]
},
{
"cell_type": "markdown",
"metadata": {
"ExecuteTime": {
"end_time": "2020-10-27T14:55:28.986812Z",
"start_time": "2020-10-27T14:55:28.980088Z"
},
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"`a[i]` with `i >= len(a)` results in an `IndexError`. "
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"A negative index represents a negative offset from an imaginary element one past the end of the sequence."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "-"
}
},
"source": [
"$$\\begin{aligned} \\boldsymbol{a} &= (a_0, ... , a_{n-1})\\\\\n",
"& = (a_{-n}, ..., a_{-1})\n",
"\\end{aligned}$$"
]
},
{
"cell_type": "code",
"execution_count": 149,
"metadata": {
"ExecuteTime": {
"end_time": "2020-11-02T23:48:34.920475Z",
"start_time": "2020-11-02T23:48:34.906520Z"
},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]\n",
"Last element: 9\n",
"Second last element: 8\n",
"First element: 0\n"
]
},
{
"ename": "IndexError",
"evalue": "list index out of range",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mIndexError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[1;32m 4\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'Second last element:'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0ma\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m-\u001b[0m\u001b[0;36m2\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 5\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'First element:'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0ma\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m-\u001b[0m\u001b[0mlen\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0ma\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 6\u001b[0;31m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0ma\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m-\u001b[0m\u001b[0mlen\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0ma\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m-\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;31m# IndexError\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[0;31mIndexError\u001b[0m: list index out of range"
]
}
],
"source": [
"a = [*range(10)]\n",
"print(a)\n",
"print(\"Last element:\", a[-1])\n",
"print(\"Second last element:\", a[-2])\n",
"print(\"First element:\", a[-len(a)])\n",
"print(a[-len(a) - 1]) # IndexError"
]
},
{
"cell_type": "markdown",
"metadata": {
"ExecuteTime": {
"end_time": "2020-10-29T04:10:06.523676Z",
"start_time": "2020-10-29T04:10:06.517287Z"
},
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"`a[i]` with `i < -len(a)` results in an `IndexError`. "
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"**How to select multiple items?**"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"We can use [slicing](https://docs.python.org/3/reference/expressions.html#slicings) to select a range of items as follows:\n",
"```Python\n",
"a[start:stop]\n",
"a[start:stop:step]\n",
"```\n",
"\n",
"The selected items corresponds to those indexed using `range`:\n",
"\n",
"```Python\n",
"(a[i] for i in range(start, stop))\n",
"(a[i] for i in range(start, stop, step))\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": 73,
"metadata": {
"ExecuteTime": {
"end_time": "2020-11-02T23:49:23.393787Z",
"start_time": "2020-11-02T23:49:23.389376Z"
},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(1, 2, 3)\n",
"(1, 3)\n"
]
}
],
"source": [
"a = (*range(10),)\n",
"print(a[1:4])\n",
"print(a[1:4:2])"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"Unlike `range`, the parameters for slicing take their default values if missing or equal to None:"
]
},
{
"cell_type": "code",
"execution_count": 59,
"metadata": {
"ExecuteTime": {
"end_time": "2020-11-02T23:49:36.191993Z",
"start_time": "2020-11-02T23:49:36.188102Z"
},
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[0, 1, 2, 3]\n",
"[1, 2, 3, 4, 5, 6, 7, 8, 9]\n",
"[1, 2, 3]\n"
]
}
],
"source": [
"a = [*range(10)]\n",
"print(a[:4]) # start defaults to 0\n",
"print(a[1:]) # stop defaults to len(a)\n",
"print(a[1:4:]) # step defaults to 1"
]
},
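{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"The defaults can also be requested explicitly with `None`, either inside the brackets or through a built-in `slice` object. The following is only a minimal sketch; the name `b` is an assumption chosen for this illustration.\n",
"\n",
"```python\n",
"b = [*range(10)]  # hypothetical example list\n",
"\n",
"# None asks for the default value of a slicing parameter\n",
"assert b[None:4] == b[:4] == [0, 1, 2, 3]\n",
"assert b[1:None] == b[1:] == [1, 2, 3, 4, 5, 6, 7, 8, 9]\n",
"assert b[1:4:None] == b[1:4] == [1, 2, 3]\n",
"\n",
"# Subscripting with a slice object selects the same items as b[1:4]\n",
"assert b[slice(1, 4, None)] == b[1:4]\n",
"```"
]
},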
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"The parameters can also take negative values:"
]
},
{
"cell_type": "code",
"execution_count": 61,
"metadata": {
"ExecuteTime": {
"end_time": "2020-11-02T23:49:59.510025Z",
"start_time": "2020-11-02T23:49:59.505499Z"
},
"scrolled": true,
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[9]\n",
"[0, 1, 2, 3, 4, 5, 6, 7, 8]\n",
"[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]\n"
]
}
],
"source": [
"print(a[-1:])\n",
"print(a[:-1])\n",
"print(a[::-1]) # What are the default values used here?"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"A mixture of negative and postive values are also okay:"
]
},
{
"cell_type": "code",
"execution_count": 155,
"metadata": {
"ExecuteTime": {
"end_time": "2020-11-02T23:51:22.313831Z",
"start_time": "2020-11-02T23:51:22.308366Z"
},
"scrolled": true,
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[]\n",
"[1, 2, 3, 4, 5, 6, 7, 8]\n",
"[]\n",
"[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]\n"
]
}
],
"source": [
"print(a[-1:1]) # equal [a[-1], a[0]]?\n",
"print(a[1:-1]) # equal []?\n",
"print(a[1:-1:-1]) # equal [a[1], a[0]]?\n",
"print(a[-100:100]) # result in IndexError like subscription?"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"**Exercise** (Challenge) Complete the following function to return a tuple `(start, stop, step)` such that `range(start, stop, step)` gives the non-negative indexes of the sequence of elements selected by `a[i:j:k]`.\n",
"\n",
"*Hint:* See [notes 3 to 5 in the Python documentation](https://docs.python.org/3/library/stdtypes.html#common-sequence-operations)."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"nbgrader": {
"grade": false,
"grade_id": "sss",
"locked": false,
"schema_version": 3,
"solution": true,
"task": false
},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [],
"source": [
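"def sss(a, i=None, j=None, k=None):\n",
"    ### BEGIN SOLUTION\n",
"    l = len(a)\n",
"    step = 1 if k is None else k\n",
"    # Clamp to [0, l] for a positive step and to [-1, l - 1] for a negative step\n",
"    lo, hi = (0, l) if step > 0 else (-1, l - 1)\n",
"    if i is None:\n",
"        start = lo if step > 0 else hi\n",
"    else:\n",
"        start = min(max(i if i >= 0 else i + l, lo), hi)\n",
"    if j is None:\n",
"        stop = hi if step > 0 else lo\n",
"    else:\n",
"        stop = min(max(j if j >= 0 else j + l, lo), hi)\n",
"    ### END SOLUTION\n",
"    return start, stop, step\n",
"\n",
"\n",
"a = [*range(10)]\n",
"assert sss(a, -1, 1) == (9, 1, 1)\n",
"assert sss(a, 1, -1) == (1, 9, 1)\n",
"assert sss(a, 1, -1, -1) == (1, 9, -1)\n",
"assert sss(a, -100, 100) == (0, 10, 1)\n",
"assert sss(a, 0, 5) == (0, 5, 1)  # a zero index is not shifted by len(a)\n",
"assert sss(a, k=-1) == (9, -1, -1)  # defaults for a negative step"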
]
},
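{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"To cross-check a solution, note that the built-in `slice` objects provide a method `indices`, which takes a sequence length and returns such a tuple `(start, stop, step)`. The sketch below uses a hypothetical helper `same_as_slicing` (not part of the exercise) to confirm that `range(start, stop, step)` selects the same items as the slice for a few sample parameters:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"slideshow": {
"slide_type": "-"
}
},
"outputs": [],
"source": [
"def same_as_slicing(seq, i=None, j=None, k=None):\n",
"    # slice.indices clips i, j and k to a sequence of the given length\n",
"    start, stop, step = slice(i, j, k).indices(len(seq))\n",
"    return [seq[n] for n in range(start, stop, step)] == list(seq[i:j:k])\n",
"\n",
"\n",
"a = [*range(10)]\n",
"print(slice(None, None, -1).indices(len(a)))  # parameters behind a[::-1]\n",
"cases = [(-1, 1, None), (1, -1, None), (1, -1, -1), (-100, 100, None), (5, -100, -1)]\n",
"print(all(same_as_slicing(a, i, j, k) for i, j, k in cases))"
]
},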
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"**Exercise** With slicing, we can now implement a practical sorting algorithm called [quicksort](https://en.wikipedia.org/wiki/Quicksort) to sort a sequence. Explain how the code works:"
]
},
{
"cell_type": "code",
"execution_count": 101,
"metadata": {
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[84, 89, 28, 26, 86, 95, 73, 44, 98, 60]\n",
"[26, 28, 44, 60, 73, 84, 86, 89, 95, 98]\n"
]
}
],
"source": [
"def quicksort(seq):\n",
"    \"\"\"Return a sorted list of items from seq.\"\"\"\n",
"    if len(seq) <= 1:\n",
"        return list(seq)\n",
"    i = random.randint(0, len(seq) - 1)\n",
"    pivot, others = seq[i], [*seq[:i], *seq[i + 1 :]]\n",
"    left = quicksort([x for x in others if x < pivot])\n",
"    right = quicksort([x for x in others if x >= pivot])\n",
"    return [*left, pivot, *right]\n",
"\n",
"\n",
"seq = [random.randint(0, 99) for i in range(10)]\n",
"print(seq, quicksort(seq), sep=\"\\n\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"nbgrader": {
"grade": true,
"grade_id": "quick-sort",
"locked": false,
"points": 0,
"schema_version": 3,
"solution": true,
"task": false
},
"slideshow": {
"slide_type": "-"
}
},
"source": [
"The above recursion creates a sorted list as `[*left, pivot, *right]` where\n",
"- `pivot` is a randomly selected item in `seq`,\n",
"- `left` is the sorted list of items smaller than `pivot`, and\n",
"- `right` is the sorted list of items no smaller than `pivot`.\n",
"\n",
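"The base case happens when `seq` contains at most one item, in which case `seq` is already sorted. Since `pivot` is excluded from `others`, each recursive call receives fewer items than `seq`, so the recursion eventually reaches the base case."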
]
}
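,
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"To see a single step of the recursion, the sketch below splits a list around a chosen pivot position using the same slicing expression. The helper name `partition` and the sample list are assumptions for illustration and are not part of the code above:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"slideshow": {
"slide_type": "-"
}
},
"outputs": [],
"source": [
"def partition(seq, i):\n",
"    \"\"\"Split seq around the pivot seq[i] as in one step of quicksort.\"\"\"\n",
"    pivot, others = seq[i], [*seq[:i], *seq[i + 1 :]]\n",
"    smaller = [x for x in others if x < pivot]  # items that go to the left\n",
"    no_smaller = [x for x in others if x >= pivot]  # items that go to the right\n",
"    return smaller, pivot, no_smaller\n",
"\n",
"\n",
"print(partition([84, 89, 28, 26, 86], 0))"
]
}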
],
"metadata": {
"celltoolbar": "Slideshow",
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.12"
},
"latex_envs": {
"LaTeX_envs_menu_present": true,
"autoclose": false,
"autocomplete": true,
"bibliofile": "biblio.bib",
"cite_by": "apalike",
"current_citInitial": 1,
"eqLabelWithNumbers": true,
"eqNumInitial": 1,
"hotkeys": {
"equation": "Ctrl-E",
"itemize": "Ctrl-I"
},
"labels_anchors": false,
"latex_user_defs": false,
"report_style_numbering": false,
"user_envs_cfg": false
},
"rise": {
"enable_chalkboard": true,
"scroll": true,
"theme": "white"
},
"toc": {
"base_numbering": 1,
"nav_menu": {
"height": "195px",
"width": "330px"
},
"number_sections": true,
"sideBar": true,
"skip_h1_title": true,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": false,
"toc_position": {
"height": "454.418px",
"left": "1533px",
"top": "110.284px",
"width": "260.994px"
},
"toc_section_display": true,
"toc_window_display": false
},
"widgets": {
"application/vnd.jupyter.widget-state+json": {
"state": {},
"version_major": 2,
"version_minor": 0
}
}
},
"nbformat": 4,
"nbformat_minor": 4
}